fix(kg): restore risk node layer — persist risk-summary.json + parse exposure_by_category (#231) #232
Merged
Root-cause + remediation plan for the KG dropping the entire risk node layer on current-format sessions: risk-summary.json never persisted + Phase 7 parser keyed on the legacy risk_categories schema. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Move the Phase 7 risk-summary JSON parsing into an exported, side-effect-free function so the schema handling is unit-testable in isolation. Byte-equivalent behavior for the legacy risk_categories/categories schema (existing 33 kg-phase tests stay green); no functional change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The current risk-aggregator emits risk-summary.json keyed on exposure_by_category
with string exposures ("$433.75M") and string probability ("8% fail"); the parser
only handled the legacy risk_categories numeric schema, yielding 0 risk nodes.
- Add exposure_by_category as the primary category source (legacy keys preserved).
- Synthesize $-amounts from weighted_exposure/exposure_low/high when no numeric
p10/p50/p90 bits exist (legacy numeric path unchanged).
- Pass string probability through (already carries % for the downstream regex).
- Emit Exposure BEFORE Mitigation so amounts.slice(0,5) leads with real exposures,
not the RRTF figure embedded in mitigation prose.
Fixes #231 (parser half).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
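The schema handling described in this commit could look roughly like the sketch below. It is illustrative only and is not the actual buildRiskBlocksFromJson code: the helper names (pickCategories, synthesizeExposureLine, renderBlock), the mitigation field name, the relative order of the two legacy keys, and the rendered line format are all assumptions. Only the key names exposure_by_category, risk_categories, categories, weighted_exposure, exposure_low and exposure_high come from the commit messages.

```javascript
// Hypothetical sketch: helper names and output format are assumptions.

// Prefer the current schema key, then fall back to the legacy keys.
// (The order between the two legacy keys is assumed.)
function pickCategories(summary) {
  for (const key of ['exposure_by_category', 'risk_categories', 'categories']) {
    const value = summary?.[key];
    if (value && typeof value === 'object') return { key, value };
  }
  return null;
}

// Build an Exposure line from string fields when no numeric p10/p50/p90 exist.
function synthesizeExposureLine(entry) {
  const parts = [];
  if (entry.weighted_exposure) parts.push(`weighted ${entry.weighted_exposure}`);
  if (entry.exposure_low && entry.exposure_high) {
    parts.push(`range ${entry.exposure_low} to ${entry.exposure_high}`);
  }
  return parts.length ? `Exposure: ${parts.join('; ')}` : null;
}

// Emit Exposure before Mitigation so a "first N $-amounts" scan of the
// rendered block meets real exposures before figures in mitigation prose.
function renderBlock(entry) {
  const lines = [];
  const exposure = synthesizeExposureLine(entry);
  if (exposure) lines.push(exposure);
  if (entry.probability) lines.push(`Probability: ${entry.probability}`);
  if (entry.mitigation) lines.push(`Mitigation: ${entry.mitigation}`);
  return lines.join('\n');
}
```

With an entry such as { weighted_exposure: "$433.75M", probability: "8% fail", mitigation: "..." }, the exposure figure appears ahead of any dollar amount quoted in the mitigation text, which is the ordering property the commit relies on.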
Contract anchor for #231: pins both the legacy risk_categories numeric schema and the current exposure_by_category string schema, plus the exposure-before-mitigation ordering, malformed-JSON fallback, and precedence. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The live PostToolUse hook only persisted .md files, so the risk-aggregator's risk-summary.json never reached the reports table and the KG risk layer + the CRITICAL_REPORTS gate (risk-summary) silently failed.
- Add JSON_REPORT_FILENAMES allowlist (hookDBBridgeConfig.js): exact basenames, NOT all .json, so *-state.json / banker-*.json / entities.json are excluded.
- Broaden the persist gate to the allowlist. persistReport is content-agnostic; extractReportType maps /review-outputs/ → review and extractReportKey strips .json → report_key 'risk-summary'. Fail-soft (all DB writes are try/caught).
Fixes #231 (live-persistence half).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
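The exact-basename gate described above might be sketched as follows. This is an assumption-laden illustration and not the contents of hookDBBridgeConfig.js: only the JSON_REPORT_FILENAMES name and the file names come from the commit message, while the use of a Set and the helpers basenameOf, shouldPersist and reportKeyOf are hypothetical.

```javascript
// Hypothetical sketch: the Set and the helper functions are assumptions.
const JSON_REPORT_FILENAMES = new Set(['risk-summary.json']);

// Last path segment, handling both / and \ separators.
function basenameOf(filePath) {
  return filePath.split(/[\\/]/).pop();
}

// Persist markdown reports plus exactly-allowlisted JSON deliverables.
// Matching on the full basename keeps state sidecars and banker files out.
function shouldPersist(filePath) {
  const base = basenameOf(filePath);
  return base.endsWith('.md') || JSON_REPORT_FILENAMES.has(base);
}

// Derive a report key by stripping the extension (assumed behavior).
function reportKeyOf(filePath) {
  return basenameOf(filePath).replace(/\.(md|json)$/, '');
}
```

Matching whole basenames rather than the .json extension is what keeps files such as entities.json or a *-state.json sidecar from being persisted as reports.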
Bring local backfill to parity with live persistence for recovery of unpersisted sessions:
- walkMarkdown now also ingests allowlisted JSON deliverables (risk-summary.json) so the KG risk layer can be rebuilt (#231); scoped by basename, state sidecars excluded.
- Scan review-outputs/ (not just qa-outputs/) for *-state.json, so fact-validator / coverage-gap-analyzer / risk-aggregator states are captured.
- (Carries the banker_intake/banker_qa report-type matchers added during session recovery so banker artifacts get their canonical types.)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
One-command recovery for a completed local session directory that never got a sessions row: bootstraps the row (idempotent), derives transaction_name from banker-deal-context.json, delegates to backfill-local-to-db.mjs, and optionally fires the admin rebuild endpoints. Closes the gap that backfill-local-to-db.mjs aborts when no sessions row exists. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Local, faithful reproduction of POST /api/admin/sessions/:key/rebuild-kg for when no server / admin JWT is available: entity-synthesis + citation-synthesis pre-steps then buildSessionKnowledgeGraph, honoring BANKER_QA_OUTPUT for the banker KG phases. Upsert-only (mirrors the endpoint); --clean is opt-in. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
/#2)
Adversarial review found the node loop re-regexed the rendered block, so mitigation $-figures leaked into exposure_amounts (R-ANT-001 captured the $1,237,262,000 RRTF) and the title's % masqueraded as probability (65-80% instead of the 8% fail prob; 53.8%→8% decimal truncation).
- buildRiskBlocksFromJson now emits structured exposureAmounts (from exposure bits only) and probability (first %-token of the probability field, else null).
- The node loop prefers these; markdown Path B (no structured fields) still falls back to whole-block regex, unchanged.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
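One way to picture the structured extraction described above is the sketch below. It is a guess at the shape of the logic, not the real implementation: the helper names firstPercentToken and exposureAmountsOf, the regular expressions, and the assumption that amounts look like "$433.75M" or "$1,237,262,000" are all hypothetical. It covers the string schema only; the legacy numeric path is not modeled.

```javascript
// Hypothetical sketch: helper names and regexes are assumptions.

// First %-token of the probability field, or null when it is not quantified.
// The optional decimal part keeps "53.8%" whole instead of truncating it.
function firstPercentToken(probability) {
  if (typeof probability !== 'string') return null;
  const match = probability.match(/\d+(?:\.\d+)?%/);
  return match ? match[0] : null;
}

// Collect $-amounts from the exposure fields only, never from mitigation
// prose or the title, so unrelated figures cannot leak in.
function exposureAmountsOf(entry) {
  const amountPattern = /\$[\d,]+(?:\.\d+)?[KMB]?/g;
  return ['weighted_exposure', 'exposure_low', 'exposure_high']
    .map((field) => entry[field])
    .filter((value) => typeof value === 'string')
    .flatMap((value) => value.match(amountPattern) ?? []);
}
```

Reading from named fields instead of re-scanning the rendered block is the structural point: a dollar figure in mitigation text or a percentage in a title never enters the result because those strings are never searched.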
…tion Pins audit findings #1/#2 with faithful R-ANT-001 shapes: mitigation RRTF figure must not enter exposureAmounts; probability comes from the probability field not a title %; non-quantified probability → null; legacy numeric path unchanged. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Summary
Fixes #231: the knowledge graph dropped the entire risk node layer for every current-format session. There were two independent breaks, and both had to be fixed for a risk node to exist; this PR also includes an adversarial-audit remediation of extraction quality.
Root causes
- The risk-aggregator writes review-outputs/risk-summary.json, but the live PostToolUse hook and the local backfill only persisted .md, so it never reached the reports table and the KG CRITICAL_REPORTS gate (risk-summary) timed out.
- The Phase 7 parser only handled the legacy risk_categories numeric schema; the current producer emits exposure_by_category with string exposures ("$433.75M") and string probability ("8% fail").
Changes (additive, non-destructive)
- JSON_REPORT_FILENAMES allowlist (hookDBBridgeConfig.js): exact basenames (risk-summary.json), NOT all .json; consulted by the live hook (hookDBBridge.js) and the backfill. *-state.json / banker-*.json / entities.json excluded.
- buildRiskBlocksFromJson() extracted as a pure, exported, unit-tested function; extended to exposure_by_category + string exposures with the legacy numeric path preserved.
- buildRiskBlocksFromJson() now emits structured exposureAmounts / probability; the node loop prefers them so mitigation $-figures no longer leak into exposure_amounts and a title % no longer masquerades as probability. Markdown Path B unchanged.
- The backfill now ingests allowlisted JSON deliverables, scans review-outputs/ for *-state.json (pre-existing gap), and carries the banker report-type matchers.
- Recovery scripts: scripts/restore-unpersisted-session.mjs, scripts/rebuild-kg-local.mjs.
Adversarial audit + remediation
An independent reviewer ran the real risk-summary.json through the parser and found two HIGH defects (mitigation-figure leak; wrong probability). Both were remediated structurally and pinned with real-shape regression tests (test/sdk/kg-phase7-risk-parser.test.js, 13 cases). Phase 13 is intentionally untouched (it requires numeric p10/p50/p90 and cleanly skips string findings).
Validation
- Tests (node --test), incl. legacy + current schema + the two audit regressions.
- risk-summary.json → 11 risk blocks (was 0), 0 mitigation leaks, correct probabilities.
- Session 2026-06-16-1781644875: backfilled risk-summary report → re-ran KG → 11 risk nodes with clean exposure_amounts / probability (0 leaks) → reapplied embeddings (34 reports incl. risk-summary).
Notes
- kgHelpers alias unchanged.
- documentConverter already excludes risk-summary.json (no DOCX/PDF).
- All DB writes fail-soft.
- docs/pending-updates/kg-risk-layer-fix-231.md.
🤖 Generated with Claude Code